1001 Paraphrases: Incenting Responsible Contributions in Collecting Paraphrases from Volunteers
Author
Abstract
A variety of applications can benefit from broad and detailed repositories of linguistic and world knowledge. An emerging approach to acquiring such repositories is to collect them from volunteer contributors. To increase the volume of contributions, some deployed systems for collecting volunteer-contributed knowledge offer recognition or prizes to those who provide the highest volume of contributions. However, rewarding for volume alone can encourage irresponsible contributions by unscrupulous participants. In this paper, we present an approach to collection from volunteers which incents responsible contributions. Rather than asking contributors to simply enter knowledge, our approach is to collect additional answers by asking contributors to guess partially obfuscated answers. To test the approach, we have implemented an online game, 1001 Paraphrases (http://aigames.org/paraphrase.html), and deployed it to collect 20,944 entries paraphrasing 400 statements. We present preliminary observations and lessons learned on the success of the approach.
Similar resources
External Plagiarism Detection based on Human Behaviors in Producing Paraphrases of Sentences in English and Persian Languages
With the advent of the internet and easy access to digital libraries, plagiarism has become a major issue. Applying search engines is one of the plagiarism detection techniques that converts plagiarism patterns to search queries. Generating suitable queries is the heart of this technique and existing methods suffer from lack of producing accurate queries, Precision and Speed of retrieved result...
UCD-PN: Selecting General Paraphrases Using Conditional Probability
We describe a system which ranks human-provided paraphrases of noun compounds, where the frequency with which a given paraphrase was provided by human volunteers is the gold standard for ranking. Our system assigns a score to a paraphrase of a given compound according to the number of times it has co-occurred with other paraphrases in the rest of the dataset. We use these co-occurrence statistic...
Chinese Whispers: Cooperative Paraphrase Acquisition
We present a framework for the acquisition of sentential paraphrases based on crowdsourcing. The proposed method maximizes the lexical divergence between an original sentence s and its valid paraphrases by running a sequence of paraphrasing jobs carried out by a crowd of non-expert workers. Instead of collecting direct paraphrases of s, at each step of the sequence workers manipulate semantical...
A Class-oriented Approach to Building a Paraphrase Corpus
Towards deep analysis of compositional classes of paraphrases, we have examined a class-oriented framework for collecting paraphrase examples, in which sentential paraphrases are collected for each paraphrase class separately by means of automatic candidate generation and manual judgement. Our preliminary experiments on building a paraphrase corpus have so far been producing promising results, ...
Extracting Paraphrases from a Parallel Corpus
While paraphrasing is critical both for interpretation and generation of natural language, current systems use manual or semi-automatic methods to collect paraphrases. We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. Our approach yields phrasal and single word lexical paraphrases as well as sy...
Publication date: 2005